Bayesian Learning of Gaussian Mixture Densities for Hidden Markov Models

Authors

  • Jean-Luc Gauvain¹
  • Chin-Hui Lee
Abstract

An investigation into the use of Bayesian learning of the parameters of a multivariate Gaussian mixture density has been carried out. In a continuous density hidden Markov model (CDHMM) framework, Bayesian learning serves as a unified approach for parameter smoothing, speaker adaptation, speaker clustering, and corrective training. The goal of this study is to enhance model robustness in a CDHMM-based speech recognition system so as to improve performance. Our approach is to use Bayesian learning to incorporate prior knowledge into the CDHMM training process in the form of prior densities of the HMM parameters. The theoretical basis for this procedure is presented, and preliminary results of its application to HMM parameter smoothing, speaker adaptation, and speaker clustering are given. Performance improvements were observed on tests using the DARPA RM task. For speaker adaptation, under a supervised learning mode with 2 minutes of speaker-specific training data, a 31% reduction in word error rate was obtained compared to speaker-independent results. Using Bayesian learning for HMM parameter smoothing and sex-dependent modeling, a 21% error reduction was observed on the FEB91 test.

INTRODUCTION

When training sub-word units for continuous speech recognition using probabilistic methods, we are faced with the general problem of sparse training data, which limits the effectiveness of conventional maximum likelihood approaches. The sparse training data problem cannot always be solved by the acquisition of more training data. For example, in the case of rapid adaptation to new speakers or environments, the amount of data available for adaptation is usually much less than what is needed to achieve good performance in speaker-dependent applications. Techniques used to alleviate the insufficient training data problem include probability density function (pdf) smoothing, model interpolation, corrective training, and parameter sharing. The first three techniques have been developed for HMMs with discrete pdfs and cannot be directly extended to the general case of the continuous density hidden Markov model (CDHMM). For example, the classical scheme of model interpolation [4], [9] can be applied to CDHMM only if tied-mixture HMMs or an increased number of mixture components are used. Our solution to the problem is to use Bayesian learning to incorporate prior knowledge into the CDHMM training process. The prior information consists of prior densities of the HMM parameters. Such an approach was shown to be effective for speaker adaptation in isolated word recognition of a 39-word English alpha-digit vocabulary, where adaptation involved only the parameters of a multivariate Gaussian state observation density of whole-word HMMs [12]. In this paper, Bayesian adaptation is extended to handle the parameters of mixtures of Gaussian densities. The theoretical basis for Bayesian learning of the parameters of a multivariate Gaussian mixture density for HMM is developed. In a CDHMM framework, Bayesian learning serves as a unified approach for parameter smoothing, speaker adaptation, speaker clustering, and corrective training. In the case of speaker adaptation, Bayesian learning may be viewed as a process for adjusting speaker-independent (SI) models to form speaker-specific ones based on the available prior information and a small amount of speaker-specific adaptation data.

¹ Jean-Luc Gauvain is on leave from the Speech Communication Group at LIMSI/CNRS, Orsay, France.
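To make the adaptation idea concrete, here is a minimal sketch in Python (not the authors' implementation) of MAP estimation of a single Gaussian mean under a conjugate normal prior. The speaker-independent model plays the role of the prior, a small amount of speaker-specific data pulls the estimate away from it, the observation variance is assumed known for simplicity, and all names and values are illustrative.

    import numpy as np

    def map_adapt_mean(prior_mean, prior_var, obs, obs_var):
        # MAP estimate of a Gaussian mean under a conjugate normal prior.
        # prior_mean, prior_var: prior supplied by the speaker-independent model.
        # obs: speaker-specific adaptation data; obs_var: known observation variance.
        n = len(obs)
        # Posterior precision is the sum of the prior and data precisions.
        post_precision = 1.0 / prior_var + n / obs_var
        # The posterior mode interpolates between the prior mean and the
        # sample mean, each weighted by its precision.
        return (prior_mean / prior_var + np.sum(obs) / obs_var) / post_precision

    # With only 5 adaptation frames the estimate stays close to the SI prior;
    # as more data arrives it approaches the maximum likelihood estimate.
    rng = np.random.default_rng(0)
    print(map_adapt_mean(0.0, 1.0, rng.normal(2.0, 1.0, size=5), 1.0))

With little data the prior dominates and the estimate remains near the speaker-independent value; as n grows the data term dominates and the MAP estimate converges to the ML estimate, which is exactly the behavior wanted for rapid adaptation.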
The prior densities are simultaneously estimated during the SI training process along with the estimation of the SI model parameters. The joint prior density for the parameters in a state is assumed to be a product of normal-gamma densities for the mean and variance parameters of the mixture Gaussian components and a Dirichlet density for the mixture gain parameters. The SI models are used to initialize the iterative adaptation process. The speaker-specific models are derived from the adaptation data using a segmental MAP algorithm, which uses the Viterbi algorithm to segment the data and an EM algorithm to estimate the mode of the posterior density. In the next section the principle of Bayesian learning for CDHMM is presented. The remaining sections report preliminary results obtained for model smoothing, speaker adaptation, and sex-dependent modeling.

MAP ESTIMATE OF CDHMM

The difference between maximum likelihood (ML) estimation and Bayesian learning lies in the assumption of an appropriate prior distribution of the parameters to be estimated. If θ is the parameter vector to be estimated from a sequence of n observations x₁, ..., xₙ, given a prior density P(θ), then one way to estimate θ is to use the maximum a posteriori (MAP) estimate, which corresponds to the mode of the posterior density P(θ | x₁, ..., xₙ), i.e.

    θ_MAP = argmax_θ P(θ | x₁, ..., xₙ) = argmax_θ P(x₁, ..., xₙ | θ) P(θ).
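As an illustration of the reestimation implied by such priors, the following is a univariate, single-state sketch in Python (hypothetical hyperparameter names, not the paper's equations) of one EM step toward the posterior mode of a Gaussian mixture, with a Dirichlet prior on the mixture gains and a normal-gamma prior on each component's mean and precision. In the segmental MAP algorithm described above, a step of this kind would be applied to the frames that the Viterbi segmentation assigns to each state.

    import numpy as np

    def map_em_step(x, w, mu, var, nu, tau, mu0, alpha, beta):
        # One EM iteration toward the mode of the posterior density of a
        # univariate Gaussian mixture. Hypothetical prior hyperparameters:
        #   nu       -- Dirichlet parameters for the mixture gains w
        #   tau, mu0 -- normal prior on each mean (precision scale, location)
        #   alpha, beta -- gamma prior on each component precision 1/var
        x = np.asarray(x, dtype=float)
        # E-step: responsibility of each component for each observation.
        norm = np.sqrt(2.0 * np.pi * var)
        lik = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / norm
        gamma = lik / lik.sum(axis=1, keepdims=True)
        c = gamma.sum(axis=0)  # expected counts per component
        # M-step: MAP reestimates blend prior pseudo-counts with the data.
        w_new = (nu - 1.0 + c) / (nu - 1.0 + c).sum()
        mu_new = (tau * mu0 + gamma.T @ x) / (tau + c)
        s = (gamma * (x[:, None] - mu_new) ** 2).sum(axis=0)
        var_new = (2.0 * beta + tau * (mu_new - mu0) ** 2 + s) / (2.0 * alpha - 1.0 + c)
        return w_new, mu_new, var_new

Setting the hyperparameters to their "flat" values (nu = 1, tau = 0, alpha = 0.5, beta = 0) reduces these updates to the ordinary ML reestimation formulas, so the prior terms can be read as pseudo-observations contributed by the SI models.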

Similar articles

Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains

In this paper a framework for maximum a posteriori (MAP) estimation of hidden Markov models (HMM) is presented. Three key issues of MAP estimation, namely the choice of prior distribution family, the specification of the parameters of prior densities and the evaluation of the MAP estimates, are addressed. Using HMMs with Gaussian mixture state observation densities as an example, it is assumed ...

Image Segmentation using Gaussian Mixture Model

Abstract: Stochastic models such as mixture models, graphical models, Markov random fields, and hidden Markov models have a key role in probabilistic data analysis. In this paper, we applied a Gaussian mixture model to the pixels of an image. The parameters of the model were estimated by the EM algorithm. In addition, the label corresponding to each pixel of the true image was assigned by Bayes' rule. In fact,...

Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering

Gaussian Mixture Models (GMMs) of the power spectral densities of speech and noise are used with explicit Bayesian estimation in Wiener filtering of noisy speech. No assumption is made about the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined systems of equatio...

Self-organization in mixture densities of HMM based speech recognition

In this paper, experiments are presented that apply the Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ) to training mixture density hidden Markov models (HMMs) for automatic speech recognition. The decoding of spoken words into text is performed using speaker-dependent, but vocabulary- and context-independent, phoneme HMMs. Each HMM has a set of states, and the output density of each state is...

Publication date: 1991